A Fast Approximation Scheme for Probabilistic Wavelet Synopses

Authors

  • Antonios Deligiannakis
  • Minos N. Garofalakis
  • Nick Roussopoulos
Abstract

Several studies have demonstrated the effectiveness of Haar wavelets in reducing large amounts of data down to compact wavelet synopses that can be used to obtain fast, accurate approximate query answers. While Haar wavelets were originally designed for minimizing the overall root-mean-squared (i.e., L2-norm) error in the data approximation, the recently proposed idea of probabilistic wavelet synopses also enables their use in minimizing other error metrics, such as the relative error in individual data-value reconstruction, which is arguably the most important for approximate query processing. Known construction algorithms for probabilistic wavelet synopses employ probabilistic schemes for coefficient thresholding that are based on optimal Dynamic-Programming (DP) formulations over the error-tree structure for Haar coefficients. Unfortunately, these (exact) schemes can scale quite poorly for large data-domain and synopsis sizes. To address this shortcoming, in this paper, we introduce a novel, fast approximation scheme for building probabilistic wavelet synopses over large data sets. Our algorithm’s running time is near-linear in the size of the data-domain (even for very large synopsis sizes) and proportional to 1/ε, where ε is the desired approximation guarantee. The key technical idea in our approximation scheme is to make exact DP formulations for probabilistic thresholding much “sparser”, while ensuring a maximum relative degradation of ε on the quality of the approximate synopsis, i.e., the desired approximation error metric. Extensive experimental results over synthetic and real-life data clearly demonstrate the benefits of our proposed techniques.
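As background for the abstract, the following is a minimal sketch of the conventional (deterministic) Haar wavelet synopsis that it contrasts with probabilistic synopses. It is not the approximation scheme of the paper. The function names (`haar_decompose`, `haar_reconstruct`, `threshold_synopsis`) and the sample data are illustrative assumptions. The sketch computes unnormalized Haar coefficients in error-tree order by repeated pairwise averaging and differencing, then keeps only the `budget` coefficients with the largest normalized magnitude, which is the standard choice for minimizing L2 error.

```python
import math


def haar_decompose(data):
    """Unnormalized Haar transform in error-tree order:
    [overall average, coarsest detail, ..., finest details]."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = list(data)
    details = []
    while len(averages) > 1:
        pairs = [(averages[i], averages[i + 1])
                 for i in range(0, len(averages), 2)]
        level_details = [(a - b) / 2 for a, b in pairs]
        averages = [(a + b) / 2 for a, b in pairs]
        # Coarser levels go in front of the finer ones already collected.
        details = level_details + details
    return averages + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose by expanding one resolution level at a time."""
    values = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(values)]
        values = [v for avg, d in zip(values, level)
                  for v in (avg + d, avg - d)]
        pos += len(level)
    return values


def threshold_synopsis(coeffs, budget):
    """Keep the `budget` coefficients with the largest normalized
    magnitude and zero out the rest (conventional L2 thresholding)."""
    def weight(i):
        level = 0 if i == 0 else i.bit_length() - 1
        return abs(coeffs[i]) / math.sqrt(2 ** level)

    ranked = sorted(range(len(coeffs)), key=weight, reverse=True)
    keep = set(ranked[:budget])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


if __name__ == "__main__":
    data = [2, 2, 0, 2, 3, 5, 4, 4]  # illustrative data vector
    coeffs = haar_decompose(data)
    synopsis = threshold_synopsis(coeffs, budget=2)
    print(coeffs)
    print(haar_reconstruct(synopsis))
```

With a budget of two coefficients, the reconstruction replaces each half of the data vector by its average. The value 0 is then approximated by 1.5, a large relative error that the L2 criterion does not penalize specifically. Errors of this kind are what the relative-error metrics mentioned in the abstract are meant to control.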

Similar resources

Probabilistic Wavelet Synopses for Multiple Measures

The recently proposed idea of probabilistic wavelet synopses has enabled their use as a tool for reducing large amounts of data down to compact wavelet synopses that can be used to obtain fast, accurate approximate answers to user queries, while at the same time providing guarantees on the accuracy of individual answers. Relatively little attention, however, has been paid to the problem of usin...

Constructing Optimal Wavelet Synopses

The wavelet decomposition is a proven tool for constructing concise synopses of massive data sets and rapidly changing data streams, which can be used to obtain fast approximate answers with accuracy guarantees. In this work we present a generic formulation for the problem of constructing optimal wavelet synopses under space constraints for various error metrics, both for static and streaming d...

Linked Bernoulli Synopses: Sampling along Foreign Keys

Random sampling is a popular technique for providing fast approximate query answers, especially in data warehouse environments. Compared to other types of synopses, random sampling bears the advantage of retaining the dataset’s dimensionality; it also associates probabilistic error bounds with the query results. Most of the available sampling techniques focus on table-level sampling, that is, t...

Workload-Based Wavelet Synopses

This paper introduces workload-based wavelet synopses, which exploit query workload information to significantly boost accuracy in approximate query processing. We show that wavelet synopses can adapt effectively to workload information, and that they have significant advantages over previous approaches. An important aspect of our approach is optimizing synopses constructions toward error metri...

AN ADAPTIVE WAVELET SOLUTION TO GENERALIZED STOKES PROBLEM

In this paper we will present an adaptive wavelet scheme to solve the generalized Stokes problem. Using divergence-free wavelets, the problem is transformed into an equivalent matrix-vector system that leads to a positive definite system of reduced size for the velocity. This system is solved iteratively, where the application of the infinite stiffness matrix, that is sufficiently compressible, is r...

Publication date: 2005